
SELECT source_id, ra, dec
FROM gaiadr1.tgas_source
WHERE phot_g_mean_flux > 13
The gaia archive found here:https://gea.esac.esa.int/archive/. Click on search tab.
Search tab has tree options: Basic, Advenced(ADQL) and Query Results.

#!pip install astroquery
from astroquery.gaia import Gaia
import pandas as pd
C:\Users\2189v\anaconda3\Lib\site-packages\pandas\core\arrays\masked.py:60: UserWarning: Pandas requires version '1.3.6' or newer of 'bottleneck' (version '1.3.5' currently installed). from pandas.core import (
print("Qerying Gaia database...")
query = """
SELECT TOP 10000
source_id,
parallax,
parallax_error,
phot_g_mean_mag,
bp_rp,
ra,
ra_error
FROM gaiadr3.gaia_source
WHERE parallax > 0
ORDER BY parallax DESC
"""
path = 'closest_10000.csv'
job = Gaia.launch_job(query, output_file=path, output_format='csv', dump_to_file=True)
print(f"Job finished. Data saved to {path}. Job Info: {job}")
Qerying Gaia database... Job finished. Data saved to closest_10000.csv. Job Info: Jobid: None Phase: COMPLETED Owner: None Output file: closest_10000.csv Results: None
The dataset you provided contains the following column names and their meanings:
| Column Name | Meaning |
|---|---|
source_id |
Unique identifier for the data source record. |
parallax |
The observed parallax of the star (the apparent shift in position of celestial objects). |
parallax_error |
The measurement error associated with the parallax value. |
phot_g_mean_mag |
The Gaia G-band photometric mean magnitude of the star. |
bp_rp |
The color index between the blue (BP) and red (RP) bands. |
ra |
The right ascension coordinate (in the equatorial coordinate system). |
ra_error |
The error associated with the right ascension coordinate. |
These column names represent the fundamental astronomical properties of the stars in the dataset.
df=pd.read_csv(path)
df.head()
| source_id | parallax | parallax_error | phot_g_mean_mag | bp_rp | ra | ra_error | |
|---|---|---|---|---|---|---|---|
| 0 | 5853498713190525696 | 768.066539 | 0.049873 | 8.984749 | 3.804580 | 217.392321 | 0.023999 |
| 1 | 4472832130942575872 | 546.975940 | 0.040116 | 8.193974 | 2.833697 | 269.448503 | 0.026239 |
| 2 | 3864972938605115520 | 415.179416 | 0.068371 | 11.038391 | 4.184836 | 164.103190 | 0.066837 |
| 3 | 762815470562110464 | 392.752945 | 0.032067 | 6.551172 | 2.215609 | 165.830960 | 0.024126 |
| 4 | 2947050466531873024 | 374.489589 | 0.231335 | 8.524133 | -0.278427 | 101.286626 | 0.164834 |
df.shape
(10000, 7)
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 10000 entries, 0 to 9999 Data columns (total 7 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 source_id 10000 non-null int64 1 parallax 10000 non-null float64 2 parallax_error 10000 non-null float64 3 phot_g_mean_mag 9985 non-null float64 4 bp_rp 9632 non-null float64 5 ra 10000 non-null float64 6 ra_error 10000 non-null float64 dtypes: float64(6), int64(1) memory usage: 547.0 KB
import matplotlib.pyplot as plt
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(
x=df['parallax'],
y=df['ra'],
mode='markers',
marker=dict(
size=5,
color=df['bp_rp'],
colorscale='Viridis',
showscale=True,
colorbar_title="bp_rp"
)
))
fig.update_layout(
title='Interactive Scatter Plot',
xaxis_title='Parallax',
yaxis_title='RA',
hovermode='closest'
)
fig.show()
This graph creates a scatter plot using the parallax and ra columns. Each point represents a star in the dataset.
The colors of the points vary based on the values in the bp_rp column. The bp_rp column indicates the difference in brightness between the blue (BP) and red (RP) wavelengths of the stars. This allows an additional attribute to be visualized through the color coding.
The graph is interactive, meaning users can hover over the points to see more information. This type of live visualization can be very useful for data exploration and analysis.
import matplotlib.pyplot as plt
import numpy as np
df['distance'] = 1 / (df['parallax'] * 1e-3)
df['abs_g'] = df['phot_g_mean_mag'] + 5 + 5 * np.log10(df['distance'])
df['abs_mag'] = df['phot_g_mean_mag'] + 5 * (np.log10(df['distance']) - 1)
(bp_rp,abs_mag)
plt.figure(figsize=(10, 8))
plt.scatter(df['bp_rp'], df['abs_mag'], s=1)
plt.gca().invert_yaxis()
plt.xlabel('BP - RP')
plt.ylabel('Absolute G Magnitude')
plt.title('Color-Magnitude Diagram of the 10,000 Closest Stars')
plt.grid(True)
plt.show()
This visual shows the color-magnitude diagram of the 10,000 closest stars. The position of the stars provides information about their surface temperatures (color index) and luminosities (absolute magnitude).
plt.figure(figsize=(10, 8))
plt.scatter(df['parallax'], df['distance'], s=1)
plt.gca().invert_yaxis()
plt.xlabel('Parallax')
plt.ylabel('Distance(parsecs)')
plt.title('Parallax vs Distance')
plt.grid(True)
plt.show()
This plot displays a scatter graph of the parallax and distance values for the stars in the dataset. As the parallax values increase, the distance to the stars decreases. This visualization demonstrates the inverse relationship between parallax and distance, and illustrates the distances to the stars based on their parallax measurements.
In this notebook, we have used the Gaia DR3 dataset to query the 10,000 closest stars to Earth using ADQL. We have performed exploratory data analysis (EDA) on the retrieved data, including visualizations of the relationships between various parameters such as right ascension, parallax, color index, absolute magnitude, and distance. These visualizations provide insights into the properties of stars in our cosmic neighborhood, and demonstrate the power of ADQL queries for accessing and analyzing astronomical data